Abstract. The Gaia mission is expected to provide
highly accurate astrometric, photometric, and
spectroscopic measurements for about 10^9 objects.
Automated classification of detected sources is a
key part of the data processing. Here a few aspects
of the Gaia classification process are presented.
Information from other surveys at longer
wavelengths, and from follow-up ground-based
observations, will be complementary to Gaia data,
especially at faint magnitudes, and will offer a
great opportunity to understand our Galaxy.
1. Introduction

ESA's Gaia mission, to be launched in 2011,
is designed to obtain accurate positions,
parallaxes and proper motions for 10^9 objects
all over the sky, down to magnitude G ~ 20, with
an astrometric accuracy at the μarcsec level.
The low-dispersion spectroscopy obtained in the
BP and RP passbands (330-1000 nm, resolution of
3-30 nm) will be used not only to correct the
astrometry for color effects, but also to
characterize the sources themselves. The
Radial Velocity Spectrograph (RVS, 840-890 nm,
resolving power R = 11 500) will measure radial
velocities with a precision of a few km s^-1,
down to G ~ 16.
Gaia will observe the whole sky for five years
(plus a possible one-year extension), achieving
a mean of ~80 observations per source.
The final catalog will be available in 2020,
preceded by an early data release. The data
reduction is a great challenge: the Gaia-related
data will amount to about 10^15 bytes, while the
final data delivered to the community will be
about 20 TB. The estimated computation size is
of the order of 10^21 floating-point operations
(see Mignard et al. 2008).
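
The orders of magnitude above can be checked with a few lines of
arithmetic. The sketch below is only illustrative: the sustained
processing rate of 10^13 floating-point operations per second is an
assumption made here for the example, not a figure from the Gaia
project, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the text.
N_SOURCES = 1e9        # number of observed objects (from the text)
OBS_PER_SOURCE = 80    # mean number of observations per source (from the text)
TOTAL_FLOP = 1e21      # estimated computation size (from the text)

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def total_observations(n_sources=N_SOURCES, obs_per_source=OBS_PER_SOURCE):
    """Total number of single observations collected over the mission."""
    return n_sources * obs_per_source


def compute_time_years(total_flop=TOTAL_FLOP, sustained_flops=1e13):
    """Years of uninterrupted computing needed for total_flop operations.

    sustained_flops is an ASSUMED sustained rate (operations per second),
    chosen only for illustration.
    """
    return total_flop / sustained_flops / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"observations: {total_observations():.1e}")
    print(f"compute time: {compute_time_years():.2f} years at the assumed rate")
```

With the assumed rate, 10^21 operations correspond to roughly three
years of uninterrupted computing, which gives a feeling for why the
data reduction is described as a challenge.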



2. The classification of Gaia objects 

Gaia does not use a full-sky input catalog.
However, in the early stages of the mission, the
Guide Star Catalog II (GSC-II; Lasker et al. 2008)
will be used as input for the initial source
list to support the identification of the objects.
The GSC-II is constructed from the scanned
images of the Palomar and UK Schmidt
photographic surveys, digitized at the Space
Telescope Science Institute. It uses Tycho-2
data as the reference for the astrometric
calibration. This is a good example of synergy
between ground-based surveys and space
missions. Outside the Galactic plane the catalog
is complete down to R_F ~ 20, but stellar
classification is reliable at the 90% level only
down to R_F ~ 19.5. Coordinates are provided
with a mean precision of 0.2"-0.28" for about
9 x 10^8 objects.
The final Gaia catalog is expected to provide
positions, parallaxes, proper motions, radial
velocities, photometry in the two broad bands
BP/RP, a discrete classification of sources,
astrophysical parameters (APs) for single stars
(i.e. Teff, log g, ...), and possibly the
parametrization of special sources (galaxies,
QSOs). To avoid biases, and to build a reference
frame for astrometry, Gaia will re-classify the
observed objects. Such a large amount of data
can be classified only in an automated way. In
the Gaia project, the classification algorithms
are based on both supervised and unsupervised
methods (see Smith et al. 2008). They first
produce a discrete classification of the objects,
i.e. they separate the objects according to their
probability of being stars, galaxies, or QSOs,
and then estimate their APs by comparison with
a set of templates.
Finally, the treatment of the outliers will rely
on unsupervised methods. The scientific
community involved in Gaia is working to calculate
extensive libraries of synthetic and
observational templates with improved physics for
all the classes of objects, to be used as training
data for the classification algorithms. In the
classification task, Gaia data should be
complemented by external (i.e. non-Gaia)
information. The simplest way to do this is via
astrometric cross-matching to existing catalogs.
The most obvious candidates are the spectral
and/or morphological classifications from SDSS
and 2MASS, and later also UKIDSS and PS1. FIRST
could also be useful for the QSOs. This
information could easily be introduced in a
multi-component discrete classifier as priors,
i.e. pre-data estimates of the probability that
an object belongs to a given class (see
Bailer-Jones et al. 2008 for a detailed
description of the method).
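
As an illustration of how such priors enter a discrete classifier,
the following minimal sketch applies Bayes' rule to per-class
likelihoods. It is not the Gaia classifier (see Bailer-Jones et al.
2008 for that); the function name, the class labels and all numerical
values are assumptions chosen for the example.

```python
def posterior(likelihoods, priors):
    """Combine per-class likelihoods with prior class probabilities.

    Both arguments are dicts keyed by class label. The result is the
    normalized posterior probability of each class (Bayes' rule).
    """
    unnormalized = {c: likelihoods[c] * priors[c] for c in likelihoods}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all classes have zero posterior weight")
    return {c: w / total for c, w in unnormalized.items()}


if __name__ == "__main__":
    # Illustrative likelihoods, as if derived from the Gaia data alone.
    likelihoods = {"star": 0.5, "galaxy": 0.1, "qso": 0.4}

    # Flat prior: no external information is available.
    flat = {c: 1.0 / 3.0 for c in likelihoods}

    # Illustrative prior for a source that an external catalog
    # (hypothetical cross-match) flags as a likely QSO.
    external = {"star": 0.1, "galaxy": 0.1, "qso": 0.8}

    print(posterior(likelihoods, flat))
    print(posterior(likelihoods, external))
```

With a flat prior the ranking of the classes follows the likelihoods
alone, while an informative prior from a cross-matched catalog can
change which class is the most probable.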

3. Training data for object 
classification 

The Gaia object classification also includes
the determination of the APs of stars and
possibly galaxies. As stated in the previous
Section, supervised methods require the
comparison with a set of templates, either observed
or synthetic, as training data sets. While
observational programs have already started
to build a homogeneous sample of stellar
templates, it is clear that training data cannot
be purely observational, since a large variety
and a uniform coverage are required for the
parameter distributions. High-resolution,
high-quality synthetic libraries are therefore
of fundamental importance. New extensive
calculations of sets of stellar spectral libraries
with improved physics are under way. They
cover the two Gaia spectral ranges: 300-1100
nm at 0.1 nm resolution, and 840-890 nm at
0.001 nm resolution. These new libraries span
a large range in atmospheric parameters, from
super-metal-rich to very metal-poor stars, from
cool to hot stars, from dwarfs to giants,
with small steps in all parameters, typically
ΔTeff = 250 K (for cool stars), Δlog g = 0.5
dex, Δ[Fe/H] = 0.5 dex. Depending on Teff,
these libraries rely on MARCS (F, G, K stars:
Gustafsson et al. 2008), PHOENIX (cool and
C stars: Brott & Hauschildt 2005), and KURUCZ
and TLUSTY models including magnetic fields,
peculiar abundances, and mass loss (A, B, Be, O
stars: Sordo & Munari 2006; Alvarez & Plez 1998;
Kochukhov et al. 2005). WDA and WDB objects are
also included (Castanheira et al. 2006). These
models are based on different assumptions: KURUCZ
models are LTE and plane-parallel, MARCS also
implements spherical symmetry, while PHOENIX and
TLUSTY (hot stars) can calculate NLTE models
in both plane-parallel and spherical geometry
(for a more detailed discussion see
Gustafsson et al. 2008; Sordo et al. 2008). A
comparable effort is being carried out in the
galaxy domain. Recall that Gaia will extend the
existing surveys of galaxies (for instance
the SDSS, which covers only a fifth of the sky),
since it will be able to detect about 10^7
unresolved galaxies down to G = 20, covering the
whole sky for the first time since the
photographic surveys (UK, ESO, Palomar Schmidt)
of 30 years ago, and over a larger spectral
range. Large synthetic libraries of galaxy
spectra covering the main Hubble types in the
Gaia spectral range at a sampling of 1 nm or less
are under construction (see Tsalmantza et al.
2007). At present, a library of about 3800 galaxy
spectra at zero redshift, and a second one of
about 140,000 spectra at varying redshift, are
available. Finally, synthetic and semi-empirical
QSO libraries (Claeskens et al. 2006) are being
calculated.
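
To make the role of the stellar grid steps concrete, the sketch below
builds a small parameter grid with the spacings quoted in this Section
(250 K, 0.5 dex, 0.5 dex) and picks the template that minimizes
chi-square against an "observed" spectrum. It is a toy illustration
only: the parameter ranges are assumptions, `toy_spectrum` is a
hypothetical stand-in for a real synthetic spectrum and has no
physical meaning, and nearest-template matching is used here for
simplicity, not because it is the method adopted for Gaia.

```python
import itertools
import math

# Grid steps as quoted in the text; the ranges are illustrative assumptions.
TEFF = list(range(4000, 8001, 250))        # K, step 250 K
LOGG = [0.5 * k for k in range(0, 11)]     # dex, 0.0 to 5.0, step 0.5
FEH = [0.5 * k for k in range(-8, 2)]      # dex, -4.0 to +0.5, step 0.5


def toy_spectrum(teff, logg, feh, n_pix=20):
    """HYPOTHETICAL stand-in for a synthetic spectrum (not a physical model)."""
    return [
        1.0
        + 1e-4 * teff * math.sin(0.3 * i)
        + 0.05 * logg * math.cos(0.7 * i)
        + 0.1 * feh * math.sin(1.1 * i + 0.5)
        for i in range(n_pix)
    ]


def chi2(observed, model, sigma):
    """Chi-square between two spectra for a constant per-pixel error sigma."""
    return sum(((o - m) / sigma) ** 2 for o, m in zip(observed, model))


def best_template(observed, sigma=0.01):
    """Return the (Teff, log g, [Fe/H]) grid node with the lowest chi-square."""
    grid = itertools.product(TEFF, LOGG, FEH)
    return min(grid, key=lambda p: chi2(observed, toy_spectrum(*p), sigma))


if __name__ == "__main__":
    observed = toy_spectrum(5750, 4.5, -0.5)   # noiseless test input
    print(best_template(observed))
```

A finer grid reduces the distance between a real star and its nearest
template, which is one reason why small and uniform steps in all
parameters matter for the training data.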

4. QSO classification: Gaia reference 
frame 

A high-precision reference frame is obviously
mandatory for Gaia to reach an astrometric
accuracy of 10 μas. With such a requirement,
the astrometry must be self-calibrating, and for
this reason Gaia must observe a large number of
quasars to define a high-precision reference
frame. This quasar sample must be very
clean, with only a low contamination by
other objects. A probabilistic classifier is built
to select the objects with the highest
probability of being quasars. Care is taken to
construct a pure sample, discarding for instance
QSOs with low equivalent width emission lines,
which can easily be confused with highly
extincted (A_V ~ 8-10) F-G-K stars (4000-8000 K).
Preliminary results show that a pure sample of
QSOs, complete at the 65% level down to G = 20,
can be selected. This is adequate to establish a
reference frame for Gaia: Gaia will observe
500 000 QSOs brighter than G = 20, but only the
objects with the most accurate positions
(G < 18) will be used to build a reference frame.
According to our preliminary estimates, a sample
of 250 000 QSOs can be recovered with no more
than 13 contaminants (see Bailer-Jones et al.
2008). The Gaia reference frame clearly needs
to be aligned with the International Celestial
Reference Frame (ICRF) with the highest
accuracy. The ICRF is based on VLBI positions
of about 700 extragalactic radio sources. For
this reason a VLBI observational program has
been initiated to identify suitable radio
sources to align the two reference systems. At
the moment, only a few objects are useful for
this purpose, either because they are not
bright enough at optical wavelengths, or
because they have extended radio emission, which
prevents the required astrometric accuracy from
being reached (Bourda et al. 2008).
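
The sample-quality figures quoted in this Section can be expressed
with the usual definitions of completeness and contamination. The
short sketch below applies these standard definitions to the numbers
given in the text; the function names are hypothetical.

```python
def completeness(n_true_selected, n_true_total):
    """Fraction of the genuine class members that enter the selected sample."""
    if n_true_total <= 0:
        raise ValueError("n_true_total must be positive")
    return n_true_selected / n_true_total


def contamination(n_false_selected, n_selected):
    """Fraction of the selected sample that does not belong to the class."""
    if n_selected <= 0:
        raise ValueError("n_selected must be positive")
    return n_false_selected / n_selected


def purity(n_false_selected, n_selected):
    """Fraction of the selected sample that does belong to the class."""
    return 1.0 - contamination(n_false_selected, n_selected)


if __name__ == "__main__":
    # Figures from the text: at most 13 contaminants in a sample of 250 000.
    print(f"contamination: {contamination(13, 250_000):.1e}")
    print(f"purity:        {purity(13, 250_000):.6f}")
```

For 13 contaminants in a sample of 250 000, the contamination is
about 5 x 10^-5, which illustrates how strongly purity is favored
over completeness when the reference frame is built.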

5. Gaia and complementary surveys 

Gaia will be of fundamental importance for the
study of the Galactic structure at low latitudes:
the positions and velocities of the OB stars can
be measured without assuming a rotation curve
or an extinction. The distances of OB stars at 4
kpc with A_V = 4 mag of extinction will be
derived with an accuracy of 13%, and space
velocities with an accuracy of a few km/s. Fainter
stars can be measured as well, giving important
information about the mass distribution.
However, it should be noted that at faint
magnitudes (G ~ 20) the expected accuracy is
degraded and stellar APs cannot be reliably
determined. Once a three-dimensional map
of the Galactic plane is derived, and distances
and kinematics are known for all the
star-forming regions, we will be able to trace
the disk and spiral structure of the Galaxy.
The challenge will be to build dynamical
theories that reproduce the density and
velocity fields. To distinguish between m=2
and m=4 spiral arm structure, recent
simulations find that the potential needs to be
known to 10%, the line-of-sight velocity
accuracy needs to be better than 20 km/s, and
distances should be known with uncertainties
better than 30% (Minchev & Quillen 2008). This
is well within the capabilities of the Gaia
survey. However, two main problems should
be kept in mind: the first is related to dust
obscuration, which might hamper the
determination of the stellar mass density, while
the second is due to the confusion of the stars
in the field of view (FOV). Concerning
extinction, its knowledge will become a limiting
factor in the determination of the stellar
luminosities and APs. The estimates made on the
basis of G, BP, RP photometry only present some
degree of degeneracy: it is difficult to
derive both the extinction and the extinction law
for late-type giants. Using RVS information,
both parameters can be derived. In addition,
the use of diffuse interstellar bands (DIBs) as
extinction tracers will be explored. The
strongest expected DIB in the surveyed range is
at 862 nm. Its equivalent width correlates well
with the interstellar reddening (Munari et al.
2008).
A large effort is under way in the Gaia
community to ensure a proper determination of the
extinction, testing and comparing different
methods, including the use of infrared passbands
in combination with Gaia passbands (Knude &
Lindstroem 2007). In addition, Gaia is
confusion limited (BP/RP images of different stars
are superposed on the FOV) when the total
star density per transit (the sum of the star
counts in both FOVs) is higher than 750,000
stars/sq deg. This means that the innermost
degrees of the Galaxy will not be well measured.
Bright stars can probably still be recovered,
but with decreased precision. Simulations are
still ongoing (Marrese 2008). Infrared surveys
are complementary to Gaia for understanding the
structure of the inner disk and dealing with
dust obscuration. Current astrometric data in
the infrared are of poor quality, even if great
improvements were made in the recent past (see
2MASS and DENIS). UKIDSS will cover only a part
of the sky (7000 sq deg), but it will be much
deeper (K ~ 18-19 on the Galactic plane,
and K ~ 21 at higher latitudes over 35 sq deg).
Interferometric and adaptive optics astrometry
can be very promising, but they can only
provide high-precision relative astrometry in
small fields. However, absolute parallaxes in
small fields can be derived with high precision
from these observations when enough suitable
extragalactic reference sources are detected in
the field of view. The Pan-STARRS and LSST
optical/near-infrared surveys, covering the
Northern and Southern sky respectively, are
expected to observe 10^10 stars brighter than
24 mag, reaching a parallax accuracy of about
3-25 mas and 1-10 mas, respectively. One of the
space infrared surveys foreseen in the near
future is JASMINE, which is expected to cover
the Bulge and the inner disk regions. JASMINE,
however, will not go very deep (z < 14),
observing about 10^7 stars with a precision of
0.01 mas on the parallaxes. All these surveys
will be of great importance to map highly
obscured regions where Gaia is not efficient,
although they cannot reach the same astrometric
accuracy. Finally, since Gaia radial velocities
will be measured only for G < 16, spectroscopic
ground-based follow-up with 8 m class
telescopes needs to be planned.
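
The link between parallax precision and distance accuracy that
underlies this comparison can be sketched as follows. The code relies
on the first-order approximation sigma_d / d ~ sigma_parallax /
parallax, valid only for small relative errors; the function names
are hypothetical, and the 8 kpc distance used for the Bulge is an
assumed round value, not a figure from the text.

```python
def parallax_mas(distance_kpc):
    """Parallax in milliarcseconds of a source at distance_kpc kiloparsecs."""
    if distance_kpc <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_kpc


def relative_distance_error(sigma_parallax_mas, distance_kpc):
    """First-order relative distance error for a given parallax uncertainty."""
    return sigma_parallax_mas / parallax_mas(distance_kpc)


def parallax_error_for(relative_error, distance_kpc):
    """Parallax uncertainty (mas) giving the requested relative distance error."""
    return relative_error * parallax_mas(distance_kpc)


if __name__ == "__main__":
    # 13% distance accuracy at 4 kpc (figures from the text).
    print(f"{parallax_error_for(0.13, 4.0):.4f} mas")

    # 0.01 mas parallax precision (from the text) at an ASSUMED 8 kpc.
    print(f"{relative_distance_error(0.01, 8.0):.0%}")
```

Under this approximation, a 13% distance accuracy at 4 kpc
corresponds to a parallax uncertainty of about 0.03 mas, and a
0.01 mas precision would give roughly 8% at the assumed 8 kpc.
Parallax uncertainties at the mas level, by contrast, are only useful
for much nearer stars.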

In conclusion, the Gaia mission will provide
highly accurate astrometric, photometric,
and spectroscopic measurements for a large
sample of objects. The high quality of Gaia
data, especially in astrometry, will not be
reached by any of the surveys planned for the
near future. Gaia data, complemented by
information from surveys at longer wavelengths
and from follow-up ground-based observations,
will offer a great opportunity to unveil the
formation and the structure of the Galaxy.